swapping autoencoder
Swapping Autoencoder for Deep Image Manipulation
Deep generative models have become increasingly effective at producing realistic images from randomly sampled seeds, but using such models for controllable manipulation of existing images remains challenging. We propose the Swapping Autoencoder, a deep model designed specifically for image manipulation, rather than random sampling. The key idea is to encode an image into two independent components and enforce that any swapped combination maps to a realistic image. In particular, we encourage the components to represent structure and texture, by enforcing one component to encode co-occurrent patch statistics across different parts of the image. As our method is trained with an encoder, finding the latent codes for a new input image becomes trivial, rather than cumbersome. As a result, our method enables us to manipulate real input images in various ways, including texture swapping, local and global editing, and latent code vector arithmetic. Experiments on multiple datasets show that our model produces better results and is substantially more efficient compared to recent generative models.
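The swap operation described in the abstract can be illustrated with a toy sketch. This is not the paper's learned model. It assumes a grayscale image stored as a list of rows, and it uses hand-crafted global statistics (mean and standard deviation) as a stand-in for the learned texture code, with the normalized pixel map standing in for the structure code. The names `encode`, `decode`, and `swap` are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pstdev


def encode(image):
    """Split a grayscale image (list of rows of floats) into (structure, texture).

    Toy assumption: texture is the global (mean, std) of the pixels and
    structure is the normalized pixel map. The real model learns both codes.
    """
    pixels = [p for row in image for p in row]
    mean = fmean(pixels)
    std = pstdev(pixels) or 1.0  # fall back to 1.0 for a perfectly flat image
    structure = [[(p - mean) / std for p in row] for row in image]
    return structure, (mean, std)


def decode(structure, texture):
    """Recombine a structure map with a texture code into an image."""
    mean, std = texture
    return [[s * std + mean for s in row] for row in structure]


def swap(image_a, image_b):
    """Build a hybrid image: structure from image_a, texture from image_b."""
    structure_a, _ = encode(image_a)
    _, texture_b = encode(image_b)
    return decode(structure_a, texture_b)
```

In this sketch, `decode(*encode(x))` reproduces `x`, and `swap(a, b)` keeps the spatial layout of `a` while taking on the global statistics of `b`. The actual method additionally trains the model so that such hybrids look realistic, which a closed-form toy like this cannot capture.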
Review for NeurIPS paper: Swapping Autoencoder for Deep Image Manipulation
The main idea of this paper is disentangling structure and texture using an autoencoder-like architecture. However, this is not a new idea and has been studied in many previous methods. Although the authors try to differentiate this method from existing methods in terms of supervised versus unsupervised learning in the related work section, it is still not technically impressive to me. Moreover, there is no comparison to these disentanglement methods. These methods may not be directly applicable to the tasks mentioned in this paper, but I think adapting them should not be difficult.
Review for NeurIPS paper: Swapping Autoencoder for Deep Image Manipulation
This is not acceptable and some papers have been desk-rejected for doing the same thing. These additional results have to be moved elsewhere. The reviewers' opinions on this paper diverge even after considering the rebuttal and discussing. The meta-review is thus unusually detailed. The paper proposes an approach to image editing that disentangles structure and texture using an autoencoder whose latent space is decomposed into two parts, corresponding to texture and structure.
Latest Model That Might Replace GANs To Create Deepfakes
Recently, a team of researchers from UC Berkeley and Adobe Research proposed a new machine learning model, the Swapping Autoencoder, designed for image manipulation. The key idea of this research is to encode an image into two independent components and then enforce that any swapped combination maps to a realistic image. Deep generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) have gained much traction among researchers over the years. According to the researchers, deep generative models have become a popular technique for producing realistic images from randomly sampled data. However, such models face various challenges when used for controllable manipulation of existing images.
Researchers detail texture-swapping AI that could be used to create deepfakes
In a preprint paper published on Arxiv.org, the researchers claim the Swapping Autoencoder can modify any image in a variety of ways, including texture swapping, while remaining "substantially" more efficient compared with previous generative models. The researchers acknowledge that their work could be used to create deepfakes, or synthetic media in which a person in an existing image or video is replaced with someone else's likeness. In a human perceptual study, subjects were fooled 31% of the time by images created using the Swapping Autoencoder. But they also say that proposed detectors can successfully spot images manipulated by the tool at least 73.9% of the time, suggesting the Swapping Autoencoder is no more harmful than other AI-powered image manipulation tools.